Parmenides: An Opportunity For ISO TC37 SC4?
Abstract
Despite the many initiatives in recent years aimed at creating Language Engineering standards, it is often the case that different projects use different approaches and often define their own standards. Even within the same project it often happens that different tools will require different ways to represent their linguistic data. In a recently started EU project focusing on the integration of Information Extraction and Data Mining techniques, we aim at avoiding the problem of incompatibility among different tools by defining a Common Annotation Scheme internal to the project. However, when the project was started (Sep 2002) we were unaware of the standardization effort of ISO TC37/SC4, and so we commenced once again trying to define our own schema. Fortunately, as this work is still at an early stage (the project will last till 2005) it is still possible to redirect it in a way that it will be compatible with the standardization work of ISO. In this paper we describe the status of the work in the project and explore possible synergies with the work in ISO TC37 SC4.
1 Institute of Computational Linguistics, University of Zurich, Switzerland; Biovista, Athens, Greece; Centre for Research in Information Management, UMIST, Manchester, UK; CNRS, Paris, France; Unilever Research and Development, Vlaardingen, The Netherlands; TIM/ISSCO, University of Geneva, Switzerland; Uni Magdeburg, Germany; Wordmap Ltd., Bath, UK; Neurosoft, Athens, Greece; The Greek Ministry of National Defense, Athens, Greece
Similar resources
A Framework for Standardized Syntactic Annotation
We present in this poster actual work on the building of a standard for syntactic annotation in the framework of ISO TC37/SC4. We describe here mainly the meta-model for syntactic annotation, which is building on the actual ISO proposal for a standard for morpho-syntactic annotation (MAF) and which is embedded in running efforts for defining a generic linguistic annotation
OWL/DL formalization of the MULTEXT-East morphosyntactic specifications
This paper describes the modeling of the morphosyntactic annotations of the MULTEXT-East corpora and lexicons as an OWL/DL ontology. Formalizing annotation schemes in OWL/DL has the advantages of enabling formally specifying interrelationships between the various features and making logical inferences based on the relationships between them. We show that this approach provides us with a top-dow...
Towards standards for corpus query: Work on a Lingua Franca for corpus query
In this presentation, we report about the ongoing work on the development of a standard for corpus query languages. This work takes place in the context of the ISO TC37/SC4 WG6 activity on the suggested work item proposal „Corpus Query Lingua Franca“ (Bański and Witt, 2011). We have collected a set of requirements on a corpus query language motivated by the needs of linguists and we will presen...
An API for accessing the Data Category Registry
Central Ontologies are increasingly important to manage interoperability between different types of language resources. This was the reason for ISO to set up a new committee ISO TC37/SC4 taking care of language resource management issues. Central to the work of this committee is the definition of a framework for a central registry of data categories that are important in the domain of language ...
The Linguistic Annotation Framework: a standard for annotation interchange and merging
This paper overviews the International Standards Organization Linguistic Annotation Framework (ISO LAF) developed in ISO TC37 SC4. We describe the XML serialization of ISO LAF, the Graph Annotation Format (GrAF) and discuss the rationale behind the various decisions that were made in determining the standard. We describe the structure of the GrAF headers in detail and provide multiple examples ...